Introduction
Stefan Th. Gries
Computer simulations of language change notes
This website collects my personal notes on Computer simulations of language change. These notes are provided to bring full transparency to my research process. Of course, since they are only notes, they do not reflect my final thoughts on a topic, and should not be interpreted as such. To read finished papers, please consult my website. Do not use these notes as a basis for your own scientific research. Start from high-quality, peer-reviewed scientific literature instead.
Stefan Th. Gries
Nick C. Ellis
Summary
Ellis discusses the interrelation of frequency and cognition – in cognition in general as well as in (second) language cognition – and, most importantly given current discussions in usage-based approaches to language, provides a detailed account of the factors that drive the kind of associative learning assumed by many in the field: type and token frequency, Zipfian distributions as well as recency, salience, perception, redundancy etc. Just as importantly, Ellis derives a variety of conclusions or implications of these factors for our modeling of learning and acquisition processes, which sets the stage for the papers in this volume.
three major experiential factors that affect cognition
↓
percept
categorisation
↓
fuzzy natural categories
[S]on may be like mother, and mother like sister, but in a very different way. And we learn about these families, like our own, from experience. Exemplars are similar if they have many features in common and few distinctive attributes (features belonging to one but not the other); the more similar are two objects on these quantitative grounds, the faster are people at judging them to be similar (Tversky 1977).
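The idea that similarity grows with shared features and shrinks with distinctive ones can be sketched with a contrast-style measure in the spirit of Tversky (1977). This is only an illustration: the feature sets and the weights `theta`, `alpha` and `beta` are hypothetical and do not come from the chapter.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Contrast-style similarity between two feature sets.

    Shared features raise the score; features belonging to only one
    of the two objects lower it. The weights are illustrative assumptions.
    """
    common = len(a & b)   # features in common
    only_a = len(a - b)   # distinctive features of a
    only_b = len(b - a)   # distinctive features of b
    return theta * common - alpha * only_a - beta * only_b


# Hypothetical feature sets, chosen only for illustration
robin = {"wings", "feathers", "flies", "sings", "small"}
sparrow = {"wings", "feathers", "flies", "sings", "small", "brown"}
penguin = {"wings", "feathers", "swims", "large"}

sim_robin_sparrow = contrast_similarity(robin, sparrow)
sim_robin_penguin = contrast_similarity(robin, penguin)
print(sim_robin_sparrow, sim_robin_penguin)
```

With these toy sets the robin comes out as more similar to the sparrow than to the penguin, because the first pair shares more features and has fewer distinctive ones.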
prototypes
Prototypes are judged faster and more accurately, even if they themselves have never been seen before […] Such effects make it very clear that although people don’t go around consciously counting features, they nevertheless have very accurate knowledge of the underlying frequency distributions and their central tendencies.
Language processing is very sensitive to usage frequency
↓
at all levels of language representation!
↳ (Ellis 2002)
Language knowledge involves statistical knowledge, so humans learn more easily and process more fluently high frequency forms and ‘regular’ patterns which are exemplified by many types and which have few competitors.
language learning according to psycholinguists
the rules of language
usage-based linguistics
constructions
Goldberg’s (2006) Construction Grammar argues that all grammatical phenomena can be understood as learned pairings of form (from morphemes, words, idioms, to partially lexically filled and fully general phrasal patterns) and their associated semantic or discourse functions: “the network of constructions captures our grammatical knowledge in toto, i.e. it’s constructions all the way down.”
skipped
if
constructions
↓ then…
language acquisition
↳ how?
Determinants of learning
1. input frequency
2. form
3. function
4. interactions between all
frequency of exposure
↓
phonology and phonotactics, reading, spelling, lexis, morphosyntax, formulaic language, language comprehension, grammaticality, sentence production, syntax
↳ sensitivity to input frequencies
| token frequency | type frequency |
|---|---|
| how often a particular form appears in the input | the number of distinct lexical items that can be substituted in a given slot in a construction* |
* can be both a word-level construction for inflection or a syntactic construction
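The difference between the two counts can be made concrete with a small sketch. The toy input below (an English past-tense slot with a handful of verbs) is hypothetical and only serves to show how the two measures are computed from the same data.

```python
from collections import Counter

# Hypothetical toy input: fillers observed in the verb slot of "V-ed"
observed_fillers = ["walk", "walk", "jump", "play", "walk", "talk"]

# Token frequency: how often each particular form appears in the input
token_freq = Counter(observed_fillers)

# Type frequency: how many distinct items fill the slot of the construction
type_freq = len(token_freq)

print(token_freq["walk"])  # token frequency of one form
print(type_freq)           # type frequency of the construction's slot
```

Here the form walk has a token frequency of 3, while the slot as a whole has a type frequency of 4.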
type
(so, in conclusion, it’s just cognitively advantageous, and thus easy to defend this position)
token
Irregular forms only survive because they are high frequency.
Zipf’s law (1949)
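As a reminder of what the law states: the frequency of an item is roughly inversely proportional to its frequency rank. A minimal sketch, assuming an idealised curve with an exponent of 1 and a hypothetical top frequency of 1000 (neither value is taken from the chapter):

```python
def zipf_expected_frequency(top_frequency, rank, exponent=1.0):
    """Expected frequency at a given rank under an idealised Zipfian curve."""
    return top_frequency / rank ** exponent


# Hypothetical: the most frequent item occurs 1000 times
expected = {rank: zipf_expected_frequency(1000, rank) for rank in (1, 2, 3, 10, 100)}
for rank, freq in expected.items():
    print(rank, round(freq, 1))
```

The second-ranked item is expected to occur about half as often as the first, the tenth about a tenth as often, and so on, which is why a few types account for most tokens.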
learning categories from exemplars
recency effects / priming
↓
syntactic priming
This behavior has been observed when speakers hear, speak, read or write sentences (Bock 1986; Pickering 2006; Pickering and Garrod 2006).
salience
↳ low salience cues
Many grammatical meaning-form relationships, particularly those that are notoriously difficult for second language learners like grammatical particles and inflections such as the third person singular -s of English, are of low salience in the language stream. For example, some forms are more salient: ‘today’ is a stronger psychophysical form in the input than is the morpheme ‘-s’ marking 3rd person singular present tense, thus while both provide cues to present time, today is much more likely to be perceived, and -s can thus become overshadowed and blocked, making it difficult for second language learners of English to acquire (Ellis 2006, 2008; Goldschneider and DeKeyser 2001).
central category members
↓
prototype
↳ token frequency
redundant cues
Not only are many grammatical meaning-form relationships low in salience, but they can also be redundant in the understanding of the meaning of an utterance. For example, it is often unnecessary to interpret inflections marking grammatical meanings such as tense because they are usually accompanied by adverbs that indicate the temporal reference. Second language learners’ reliance upon adverbial over inflectional cues to tense has been extensively documented […]
contingency of mapping (Shanks 1995)
Consider how, in the learning of the category of birds, while eyes and wings are equally frequently experienced features in the exemplars, it is wings which are distinctive in differentiating birds from other animals. Wings are important features to learning the category of birds because they are reliably associated with class membership, eyes are neither. Raw frequency of occurrence is less important than the contingency between cue and interpretation.
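One common way to quantify contingency in the associative-learning literature is ΔP, the probability of the outcome given the cue minus the probability of the outcome without the cue. Whether this is exactly the measure intended in the passage is an assumption on my part, and all counts below are hypothetical, chosen only so that both cues are frequent among birds but only one is contingent on the category.

```python
def delta_p(cue_outcome, cue_no_outcome, no_cue_outcome, no_cue_no_outcome):
    """Delta P = P(outcome | cue) - P(outcome | no cue) from a 2x2 table."""
    p_outcome_given_cue = cue_outcome / (cue_outcome + cue_no_outcome)
    p_outcome_given_no_cue = no_cue_outcome / (no_cue_outcome + no_cue_no_outcome)
    return p_outcome_given_cue - p_outcome_given_no_cue


# Hypothetical exemplar counts; outcome = "is a bird"
# wings: present on 80 birds and 10 non-birds, absent on 0 birds and 110 non-birds
dp_wings = delta_p(80, 10, 0, 110)
# eyes: present on 78 birds and 117 non-birds, absent on 2 birds and 3 non-birds
dp_eyes = delta_p(78, 117, 2, 3)

print(round(dp_wings, 3), round(dp_eyes, 3))
```

In this toy table both cues occur with nearly every bird, but only wings have a high ΔP; eyes have a ΔP of zero because they are just as common outside the category.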
[W]hat we really want is a model of usage and its effects upon acquisition. We can measure these factors individually. But such counts are vague indicators of how the demands of human interaction affect the content and ongoing co-adaptation of discourse, how this is perceived and interpreted, how usage episodes are assimilated into the learner’s system, and how the system reacts accordingly. We need theoretical models of learning, development, and emergence that takes these factors into account dynamically.
(skipped)
Not everything that we can count in language counts in language cognition and acquisition
If it did, the English articles the and a alongside frequent morphological inflections would be among the first learned English constructions, rather than the most problematic in L2A.
associative learning affected by…
1. factors relating to the form
2. factors relating to learner attention
emergentism
agents everywhere
[M]ore recently, work within Emergentism, Complex Adaptive Systems (CAS), and Dynamic Systems Theory (DST) has started to describe a number of scale-free, domain-general processes which characterize the emergence of pattern across the physical, natural, and social world
Emergentism and Complexity Theory (MacWhinney 1999; Ellis 1998; Elman et al. 1996; Larsen-Freeman 1997; Larsen-Freeman and Cameron 2008; Ellis and Larsen-Freeman 2009, 2006)
idea
“Emergentists believe that simple learning mechanisms […] suffice to drive the emergence of complex language representations.” (Ellis 1998, p. 657)
Language considered as a CAS of dynamic usage and its experience involves the following key features
↓
advantage of CAS
Much of CAS research investigates these interactions through the use of computer simulations (Ellis and Larsen-Freeman 2009).
Principle of Least Effort (Zipf 1949)
It has become a hallmark of Complex Systems theory where so-called fat-tailed distributions characterize phenomena at the edge of chaos, at a self-organized criticality phase-transition point midway between stable and chaotic domains.
Language usage, social roles, language learning, and conscious experience are all socially situated, negotiated, scaffolded, and guided. They emerge in the dynamic play of social intercourse. All these factors conspire dynamically in the acquisition and use of any linguistic construction. The future lies in trying to understand the component dynamic interactions at all levels, and the consequent emergence of the complex adaptive system of language itself.
William D. Raymond and Esther L. Brown
Summary
Raymond & Brown explore a range of frequency-related factors and their impact on initial fricative reduction in Spanish. They begin by pointing out that results of previous studies have been inconclusive, in part because many different studies have included only partially overlapping predictors and controls; in addition, the exact causal nature of frequency effects has also proven elusive. They then study data on [s]-initial Spanish words from the free conversations from the New Mexico-Colorado Spanish Survey, a database of interviews and free conversations initiated in 1991. A large number of different frequency-related variables is coded for each instance of an s-word, including word frequency, bigram frequency, transitional probability (in both directions), and others, and these are entered into a binary logistic regression to try to predict fricative reduction.
The results show that s-reduction is influenced by many predictors, too many to discuss here in detail. However, one very interesting conclusion is that, once a variety of contextual frequency measures is taken into consideration, then non-contextual measures did not contribute much to the regression model anymore, which is interesting since it forces us to re-evaluate our stance on frequency, from a pure repetition-based view to a more contextually-informed one, which in itself would constitute a huge conceptual development (cf. also below).
word frequency
↓
word frequency and reduction
Investigations of the processes of sound change in language going back over a century have noted that more frequent words are shorter and change more quickly than less frequent words (Schuchardt 1885; Zipf 1929).
In studies of synchronic pronunciation variation, evidence has been offered that higher word frequency is associated with more word reduction in speech production, as measured by both categorical measures of segment reduction or deletion (Bybee 2001, 2002; Krug 1998; Jurafsky et al. 2001; Raymond, Dautricourt, and Hume 2006) and also continuous measures of reduction, including durational shortening (Gahl 2008; Jurafsky et al. 2001; Pluymaekers, Ernestus, and Baayen 2005) and some acoustic parameters (Ernestus et al. 2006; Myers and Li 2007).
↓
how does frequency of word use contribute to the reductive process?
other factors
[H]owever, frequency effects are not ubiquitous.
↓ how is this possible?
1. methodological differences
↓
2. likelihood
automation
↓ ??
nature of reduction
↓ why?
influence of lexical structure and discourse environments
In the current study whether word frequency plays an independent role in on-line s- reduction is addressed by controlling both word frequency and frequency of occurrence of a word in phonological environments known to promote articulatory reduction of [s-]. The effects of other probabilistic measures are also assessed, to determine whether they contribute to s- reduction. Both intra- and extra-lexical phonological contexts are controlled, and comparison of their effects is used to determine to what extent reduction can be attributed to lexical representations or on-line articulatory processes.
As an illustration of the measures in Table 2, consider the excerpt from the corpus transcription in (1).
The s- word sobrino in the token in (1) occurs 11 times in the NMCOSS corpus, giving it a frequency per million of 146 and a log frequency of 2.17. The preceding word bigram in this token is mi sobrino, which has a frequency in the corpus statistics of 6, and the frequency of the word preceding the s- word, mi, is 485, so that the predictability of sobrino from mi is 6/485 = .0124. The s- of sobrino in this token is followed (word-internally) by the non-high vowel /o/, which is a context hypothesized to favor s- reduction. However, the vowel preceding s- is the high vowel /i/ in mi, which is hypothesized not to favor reduction. Overall in the corpus the word sobrino occurs after a non-high vowel (/o/, /a/, or /e/) only once, giving sobrino a FFC of 1/11 = .091. The frequency with which /i/ precedes /s/ at a word boundary in the corpus is 407, and the log of this frequency per million phones is 2.61.
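The arithmetic in this illustration can be retraced in a few lines. The counts are the ones quoted above; the variable names are my own, and the use of a base-10 logarithm is an assumption (the passage does not state the base).

```python
import math

# Counts quoted in the passage
sobrino_count = 11          # tokens of "sobrino" in the corpus
sobrino_per_million = 146   # reported frequency per million
bigram_count = 6            # tokens of "mi sobrino"
mi_count = 485              # tokens of "mi"
favourable_contexts = 1     # "sobrino" after a non-high vowel

# Log frequency (base 10 assumed)
log_freq = math.log10(sobrino_per_million)

# Predictability of "sobrino" from the preceding word "mi"
predictability = bigram_count / mi_count

# FFC: share of the word's tokens in a reduction-favouring context
ffc = favourable_contexts / sobrino_count

print(round(log_freq, 2), round(predictability, 4), round(ffc, 3))
```

The predictability and the FFC reproduce the reported .0124 and .091. The log of the rounded per-million figure comes out marginally below the reported 2.17, presumably because the original computation used the unrounded frequency.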
reduction effects
The failure to find any robust effects of the non-contextual word and phone unit probabilities after controlling the contextual variables suggests that speakers are sensitive to how often a word occurs in environments that encourage reduction, but not measurably to non-contextual probabilistic measures of use. Consequently, an s- word’s frequency did not predict /s-/ reduction.
How can the failure to find a significant effect of word frequency on s-reduction in datasets analyzed be reconciled with other studies, in which word frequency effects on a range of reductive processes have been reported?
Gunther De Vogelaer
Summary
De Vogelaer studies the gender systems of Dutch dialects. More specifically, he starts out from the fact that Standard Dutch exhibits a gender mismatch of the binary article system and the ternary pronominal system and explores to what degree this historical change is affected by frequency effects. Results from a questionnaire study, in which subjects were put in a position to decide on the gender of nouns, indicate high- and low-frequency items behave differently: the former are affected in particular by standardization whereas the latter are influenced more by resemanticization. However, the study also cautions us that different types of data can yield very different results with regard to the effect of frequency. De Vogelaer compares frequency data from the 9-million-word Spoken Dutch Corpus to age-of-acquisition data from a target vocabulary list. Correlation coefficients indicate that the process of standardization is more correlated with the adult spoken corpus frequencies whereas resemanticization is more correlated with the age-of-acquisition data. As De Vogelaer puts it, “frequency effects are typically poly-interpretable,” and he rightly advises readers to regularly explore different frequency measures and register-specific frequencies.
Present-day Standard Dutch differs from historical varieties of the language in that the difference between the marking of masculine and feminine gender is levelled out
Standard Dutch
↳ mismatch!
↳ consequence
[W]hile pronominal gender traditionally matched the grammatical gender of the antecedent inanimate noun, northern varieties of Dutch, including Standard Dutch as spoken in the Netherlands, seem to be shifting towards a semantic system of pronominal gender, operating along the lines of the Individuation Hierarchy (Siemund 2002; Audring 2006, 2009): highly individuated nouns (including neuter words such as masker ‘mask’ and apparaat ‘device’; cf. Audring 2009: 86) increasingly trigger the use of masculine pronouns such as hij ‘he’ or hem ‘him’, weakly individuated ones (including common nouns such as spinazie ‘spinach’ and wol ‘wool’; cf. Audring 2009: 98) combine with neuter het ‘it’.
↓
| highly individuated nouns | weakly individuated nouns |
|---|---|
| e.g. masker, apparaat | e.g. spinazie, wol |
| masculine pronouns | neuter pronouns |
The pronominal gender stands completely separate from grammatical gender.
This chapter focuses on pronominal gender in a number of varieties of Dutch in which the grammatical gender system still stands strong, more specifically on West and East Flemish dialects. In these dialects any instances of semantically-motivated pronouns are highly ambiguous with respect to the mechanism of language change explaining them: these instances may exemplify ongoing change within these varieties, but they may also be adopted from varieties of Dutch in which semantic agreement occurs more often.
In addition, not all changes in the choice of a pronoun referring to an antecedent noun are due to resemanticisation. Apart from resemanticisation there is also variation in that many nouns have a different gender in the traditional dialects than in the standard language. In more recent times, extensive levelling is causing these dialects to converge to Standard Dutch, so it is likely that many nouns having a different gender in the dialect than in Standard Dutch are under pressure to switch gender.
frequency effects
One well-known hypothesis regarding frequency is that conservative features in language are preserved longer in high frequency items (see, e.g., Bybee and Hopper 2001: 17–18; Corbett, Hippisley, Brown, and Marriot 2001; Smith 2001).
Reason for conservation
According to Phillips (2006: 87), this characterisation holds for all changes that are implemented in cases ‘when memory fails’, for instance in sound changes affecting words of which the phonetic word form is not well entrenched in memory, which drives speakers to choose pronunciations motivated by surface phonetics, pronunciations analogous to other patterns in the language, or, in general terms, innovations requiring “access to generalisations that have emerged from word forms” (Phillips 2006: 157). Changes directly involving the production of word forms, however, affect the most frequent words first (e.g. deletion, assimilation, ...).
Intra and extradialectal innovation
high frequency = probability of change
↕
high frequency = resistant to change
↓ reconciliation (Phillips 2006)
| ideologically motivated changes | ideologically free changes | |
|---|---|---|
| high-frequency items | low-frequency items | |
| typically affect words from a certain register (e.g. formal or rather informal vocabulary) | behave as changes emerging within a speech community | |
| changes directly involving production of word forms | changes being implemented ‘as memory fails’ | |
(last distinction by Phillips 2006: 157)
Phillips (2006) reaches this conclusion in an inductive manner, by generalising over a large set of examples of language change. She does not, however, provide a principled account of why high-frequency items are more liable to change involving mere word forms, even though they are allegedly more entrenched in language users’ minds.
Labov (2007)
change independently originating within a certain variety
The Dutch gender system has been undergoing change for centuries, thereby gradually decreasing the number of exponents of the grammatical three-gender system observed in the oldest documented varieties of the language: while Middle Dutch case inflection of articles, adjectives and nouns themselves revealed whether a given noun was masculine, feminine or neuter, present-day varieties of Dutch have dispensed with most of their adnominal morphology. Thus, case marking has gone and little gender agreement is left (cf. Geerts 1966). The processes of change have unevenly affected different varieties of Dutch. More particularly, they have resulted in massive geographical variation in the domain of gender marking at the level of the dialects (as described most recently in De Schutter et al. 2005), and also in smaller differences between varieties of the standard language.
| two-gender dialect | three-gender dialect |
|---|---|
| common and neuter gender | masculine, feminine, neuter gender |
In correspondence with the conservative nature of their adnominal gender system, southern varieties of Dutch have by and large preserved the traditional system of pronominal reference: anaphoric pronouns may be masculine, feminine and neuter, and are chosen on the basis of a noun’s grammatical gender.
Hence pronominal gender in these varieties differs from northern varieties of Dutch, especially in reference to inanimates, in that the vast majority of pronominal references in the south of the language area are still in line with the triadic distinction between masculine, feminine and neuter nouns (see Geeraerts 1992 for figures). This is no longer the case in areas where two-gender dialects of Dutch are spoken.
| hij | zij | het |
|---|---|---|
| highly individuated words | female persons and animals | hardly individuated forms |
Unlike for adnominal gender, where only two-gender systems are considered part of the standard language, little or no normative pressure exists to adopt a three- or a two-gender grammatical system for pronominal reference (see, e.g., Haeseryn et al. 2002: 161–162).
resemanticisation
Thus the degree to which speakers engage in resemanticisation reflects the transparency of the masculine-feminine distinction in grammar, or, put differently, the frequency with which these speakers are exposed to nonstandard gender agreement markers unambiguously distinguishing masculine and feminine gender (see also Hoppenbrouwers 1983).
Pauwels (1938) discusses the gender of a large number of nouns in Belgian Dutch dialects as documented in the late 19th century, including many East and West Flemish dialects. It appears that all these dialects at the time had preserved the grammatical three-gender system, but there is a lot of variation on the lexical level: nouns that are masculine in one dialect may be feminine or neuter elsewhere. For instance, bos ‘forest’ is masculine in some dialects, but neuter in others; kraag ‘collar’ is feminine in some dialects, masculine in others, etc. Some nouns, like suiker ‘sugar’, can even be masculine, feminine, and neuter, depending on the dialect in which they are used. Since this variation has emerged in the history of Dutch, it appears that nouns may change gender in the course of history (see Geerts 1966 for examples).
methodology
feminine gender
dialect masculine hij > Standard neuter het
dialect / standard Dutch pronoun > resemanticised pronoun
It appears that in the Flemish dialects there is indeed a statistically significant effect to use the neuter pronoun het ‘it’ to refer to mass nouns and abstracts, whether they are grammatically neuter or not: the ratio of het ‘it’ answers is higher for non-neuter mass nouns and abstracts than for non-neuter concrete count nouns: 16.3% (286/1752 answers) vs. 5.5% (98/1792); all nouns neuter in Standard Dutch have been kept out of the analysis. This effect is statistically significant (chi square = 108, df = 1; p < .001; OR = 3.39).
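The reported test can be retraced from the two counts given in the passage. The sketch below assumes a Pearson chi-square on the 2x2 table without continuity correction; the chapter does not say which variant was used, so this is an assumption, as are the function and variable names.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts quoted in the passage
het_mass, total_mass = 286, 1752     # non-neuter mass nouns and abstracts
het_count, total_count = 98, 1792    # non-neuter concrete count nouns

a, b = het_mass, total_mass - het_mass       # het vs. other answers
c, d = het_count, total_count - het_count

pct_mass = 100 * het_mass / total_mass
pct_count = 100 * het_count / total_count
chi2 = chi_square_2x2(a, b, c, d)
odds_ratio = (a / b) / (c / d)

print(round(pct_mass, 1), round(pct_count, 1), round(chi2, 1), round(odds_ratio, 2))
```

Under this assumption the percentages and the chi-square value of about 108 are reproduced. The plain sample odds ratio from these counts comes out at roughly 3.37, close to but not identical with the reported 3.39, which may reflect rounding or a different estimator.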
Examples with strong resemanticisation
Examples of nouns from the questionnaire with a strong tendency towards resemanticisation, i.e. reference with het ‘it’, are:
Examples with weak resemanticisation
As in Standard Dutch, resemanticisation seems to affect pronominal gender only (cf. similar tendencies in other Germanic varieties, as described by Siemund 2002 and Audring 2006). Quite surprisingly, as was already noted in section 3.1, no tendency is observed to extend masculine hij ‘he’ to all concrete count nouns.
The argument is made that this resemanticisation is a spontaneous development, and not a copy from northern Dutch.
| diffusion | imperfect transmission |
|---|---|
| change through (dialect) contact | change that is incrementally implemented by successive generations of language users |
| not applicable to resemanticisation | applicable to resemanticisation |
↳ imperfect transmission
↓
source
The arbitrariness of Dutch gender
These results on Dutch pronominal gender are all the more striking since in other languages in which pronouns agree in gender with their antecedent nouns, the grammatical system appears to be mastered already at a very young age, to the extent that deviations from grammatical gender are extremely rare. Thus, German children of six hardly deviate from a noun’s grammatical gender in pronominal reference (Mills 1986: 92), and the same holds for French-speaking children (Maillart 2003; Van der Velde 2003: 328, 340). This is likely due to the arbitrariness of the Dutch gender system: gender of nouns referring to inanimates is not motivated semantically in Dutch, nor are there any clues in the form of (monomorphemic) nouns that allow to determine gender (Durieux, Daelemans and Gillis 1999). Hence children acquiring Dutch can only derive nouns’ gender from the form of adnominal modifiers and pronouns, not from the form and/or meaning of the noun itself. This situation contrasts sharply with German and French, where gender assignment is at least partly motivated by semantic and/or formal regularities (see, e.g., Mills 1986 and Köpcke and Zubin 1996 on German, and Tucker, Lambert and Rigault 1977 on French). Such regularities minimise memory load, and are well-known to contribute to the acquirability of gender systems (Frigo and McDonald 1998; Gerken, Wilson and Lewis 2005).
More precisely, the acquisition of grammatical gender in pronominal reference should be conceived of as a process of ‘un-learning’ to use semantically motivated pronouns.
Frequency effects recap
The role of frequency in linguistic change has been investigated extensively with respect to phonological change (see, e.g., Hooper 1976; Bybee 1995, 2001; Phillips 1984, 2001, 2006). The relevance of word frequency has been highlighted repeatedly, e.g. by Hooper (1976), who discusses two different frequency effects: on the one hand, processes of phonetic reduction are first visible in highly frequent items, whereas, on the other hand, processes of regularisation typically affect low-frequency items. In a survey of potential frequency effects in grammar, Bybee and Hopper (2001: 10–19) mention several types of frequency effects relating to language change, among which effects boiling down to a tendency in high-frequency patterns to engage in innovations (grammaticalization, lexicalization of multi-word-patterns, formal reduction, ...), but also conservative effects in high-frequency patterns, such as the retention of certain morphological properties. Phillips (2006: 157) proposes that innovations implemented as speakers’ memory fails to provide the traditional variant typically affect low frequency items, whereas changes directly involving the production of word forms as stored in memory affect the most frequent words first.
Dialect contact and salience
In addition to playing a role in sound change and other processes of ‘regular’ linguistic change, frequency is found to play a role in dialect contact. Thus, Trudgill (1986: 11–21, 43–53) describes processes of long-term accommodation of one dialect towards another, and observes that salient features are adopted more easily.[4] It is rather obvious that, all other properties being equal, highly frequent features are more salient than infrequent features, and thus Trudgill’s observations lead to suggest that frequent items of the donor variety will be easily borrowed by the target variety. According to Phillips (2006: 141), however, such contact-induced changes will only affect high frequency items provided that there are no ‘ideological’ reasons for doing otherwise (cf. also Trudgill 1986: 17–19, 125 on ‘extra-strong salience’) and if the relevant change directly involves the production of the relevant word form.
[4] Dialect contact is understood here in a broad sense, i.e. as including contact between dialects and prestige varieties such as Standard Dutch.
Resemanticisation = ideologically free change
↳ also: directly influences the production of word forms
↓
likely frequent items first
It should be noted, however, that this hypothesis is only valid for speakers for whom the dialect is their ‘base variety’, and for whom the standard language is their second dialect. While this situation is common across older speakers in the Dutch language area, recent investigations into dialect usage in younger generations (especially Rys 2007) have revealed that it is nowadays probably more accurate to consider Standard Dutch to be the native dialect of children growing up in Belgium, since the acquisition of the dialect primarily takes place in adolescence, and is characterised by overgeneralisations typically found in second dialect acquisition.
sources for frequency information
| standardisation | resemanticisation |
|---|---|
| adult phenomenon | acquisition phenomenon |
| CGN list | AOA list |
In a way, both the adoption of Standard Dutch gender and the acquisition of a noun’s grammatical gender (which makes the noun less susceptible to resemanticisation) can be described as learning processes. Hence it is expected that the influence of frequency on both phenomena is best described by means of a ‘learning curve’: the first instances of a Standard Dutch noun will contribute stronger to the standardisation process than any succeeding ones, whereas the first instances of a noun will also be more crucial for children to determine the noun’s grammatical gender during acquisition (cf. also Hay and Baayen 2002: 208, who observe that differences amongst lower frequencies often are more salient than equivalent differences amongst higher frequencies). Therefore, rather than testing for correlations between the observed changes and raw frequency data, a logarithmic transformation has been applied on the frequency data (which indeed yields better fits).
In order to investigate the role of frequency, for each word on the questionnaire the strength was calculated with which it is affected by each of the investigated tendencies. For instance, for the noun bos ‘forest’ 92 answers are available from regions where bos is traditionally a masculine noun, whereas it is neuter in Standard Dutch. In 74 cases, the neuter pronoun het ‘it’ was given as an answer. This means that bos ‘forest’ shows a standardisation ratio of 74/92 or 80%. This figure can then be correlated with the frequency data, i.e. both with (the logarithmic transformations of) the noun’s score on Schaerlaekens, Kohnstamm, and Lejaegere’s (2000) Target Vocabulary List and the noun’s frequency in the Spoken Dutch Corpus.
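The procedure can be sketched as follows. Only the bos figures (74 out of 92) come from the text; the four frequency/ratio pairs used for the correlation are hypothetical and were deliberately chosen to be roughly log-linear, so the sketch shows the mechanics of the log transformation rather than the chapter's actual result. The choice of Pearson's r and of base 10 are likewise assumptions.

```python
import math


def standardisation_ratio(standard_answers, total_answers):
    """Share of answers that follow Standard Dutch gender."""
    return standard_answers / total_answers


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# From the text: 74 of 92 answers for bos use neuter het
bos_ratio = standardisation_ratio(74, 92)

# Hypothetical nouns: corpus frequencies and standardisation ratios
frequencies = [5, 50, 500, 5000]
ratios = [0.30, 0.55, 0.75, 0.90]

r_raw = pearson_r(frequencies, ratios)
r_log = pearson_r([math.log10(f) for f in frequencies], ratios)

print(round(bos_ratio, 2), round(r_raw, 2), round(r_log, 2))
```

With learning-curve-shaped toy data like these, the correlation with log frequency is clearly higher than with raw frequency, which is the kind of improvement in fit the passage attributes to the logarithmic transformation.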
results
From this it can be concluded that standardisation, at least in gender change, mainly affects highly frequent items: highly frequent items tend to shift towards Standard Dutch gender more easily.
↓
yes to both
Both the table and the scatter plot indicate that items high on the target vocabulary list resist resemanticisation. The very same elements are believed to be acquired early and to be the most frequent items in young children’s speech (Vervoorn 1989: 40, 46; cf. section 4.1). The fact that the target vocabulary list yields much clearer results adds support to the idea that resemanticisation relates to the language acquisition process, providing an extra argument to consider it change through ‘imperfect transmission’.
Significantly, the frequency data from the CGN do not correlate with resemanticisation. This may in part be due to the fact that the investigation only targeted a limited number of nouns, for which corpus frequency and target list score correlate less strongly than for most nouns.
Watch out with the interpretation of “frequency”!
On the basis of the stronger correlations calculated by Vervoorn (1989: 64–65), it can be expected that large-scale investigations will reveal statistically significant correlations between resemanticisation and frequency data drawn from adults (such as CGN frequency). Indeed De Vos (2009) detects clear frequency effects with respect to the proportion of pronominal references in line with grammatical gender, using frequency data from adults rather than children. This, in turn, underscores the poly-interpretability of frequency effects: within the domain of diachronic research, frequency effects may reflect liability on a language pattern’s part to engage in processes of routinization (grammaticalization, phonetic reduction, . . .), different degrees of entrenchment in grammar, different ages of acquisition, etc. Hence researchers should be very explicit on the nature of frequency effects in their data, and on the underlying explanation. In many cases, frequency effects will merely reflect some deeper property of language patterns rather than being a conclusive explanation in their own right.
The data in this chapter are a case in point: during processes of standardisation, frequency effects reflect the intensity with which dialect speakers are exposed to nouns’ standard language gender; in resemanticisation, frequency effects reveal different ages at which nouns are acquired by children, which appears to influence the odds that these nouns’ grammatical gender can be learned successfully.
skipped